DOCUMENT RESUME



ED 091 412

TM 003 621



AUTHOR       Katzenmeyer, Conrad G.; And Others
TITLE        A Model for the Development and Evaluation of
             Placement Tests for Objective Based Curriculum
             Management Systems.
PUB DATE     Apr 74
NOTE         13p.; Paper presented at American Educational
             Research Association Annual Meeting (Chicago,
             Illinois, April 15-19, 1974)
EDRS PRICE   MF-$0.75 HC-$1.50 PLUS POSTAGE
DESCRIPTORS  Diagnostic Tests; Educational Diagnosis; Elementary
             School Students; Grouping (Instructional Purposes);
             Individualized Instruction; *Models; *Placement;
             *Test Construction; *Tests
IDENTIFIERS  WDRSD; Wisconsin Design for Reading Skill
             Development



ABSTRACT 

In many objective-based curriculum management
systems, students' curricular activities are carefully directed by
their own performance through extensive pretesting. When implementing
such programs, however, there are often only rough criteria for
appropriate leveling of students, necessitating extensive retesting.
This paper outlines a model for the development and evaluation of a
placement test for the Word Attack area of the Wisconsin Design for
Reading Skill Development. A thirty-item placement test was
constructed and tried out in two elementary schools prior to program
implementation. Development strategies and effectiveness of the
placement test in minimizing leveling errors are discussed.
(Author) 



ERIC 



A MODEL FOR THE DEVELOPMENT AND EVALUATION OF PLACEMENT TESTS FOR
OBJECTIVE BASED CURRICULUM MANAGEMENT SYSTEMS



Conrad G. Katzenmeyer, Ph.D.
R&D Center for Cognitive Learning
University of Wisconsin



Deborah M. Stewart
R&D Center for Cognitive Learning
University of Wisconsin



Mary R. Quilling 
University of Massachusetts 



In many objective-based curriculum management systems, students'
curricular activities are carefully directed by their own performance
through extensive pretesting. When implementing such programs, however,
there are often only rough criteria for appropriate leveling of students,
necessitating extensive retesting. This paper outlines a model for the
development and evaluation of a placement test for the Word Attack area
of the Wisconsin Design for Reading Skill Development. A thirty-item
placement test was constructed and tried out in two elementary schools
prior to program implementation. Development strategies and effectiveness
of the placement test in minimizing leveling errors are discussed.



Paper presented at AERA Meeting, Chicago, April, 1974.






Objectives

Objectives-based curricula and curriculum management systems are
becoming more common as schools move toward individualization and
competency-based education. In curriculum management systems of this
nature, such as the Wisconsin Design for Reading Skill Development
(Otto and Askov, 1972), a student's curricular activities are carefully
directed by his own performance, usually through an extensive pretesting
procedure. Yet when such a management system is first implemented in
a school, there are often only age-grade and teacher judgment criteria
to guide initial placement. This can lead to a high retesting rate
and a considerable loss of time and resources.

The purpose of this paper is to discuss and evaluate a preliminary
model for the development of a curriculum placement test that will aid
in initial leveling of students prior to the pretesting or "break-in"
testing at any level. A successful placement test should be able to
substantially lower the retesting rate inherent in the system.

Theoretical Framework

The unique and critical role of placement tests is recognized in
models of educational assessment (Hillson and Bongo, 1971). They are
a type of diagnostic test used for determining the degree of mastery
of program objectives already attained (Bloom, Hastings, and Madaus,
1971). A placement test must accurately reflect the program's objectives,
yet must cover a wider range of those objectives than any specific level
of the program would contain. As such they represent the first stage
of a two-stage sequential testing strategy (Cronbach and Gleser, 1965).

A particular problem for many placement tests occurs when the
base rate of correct placement without the test exceeds 50%. As discussed
by Meehl and Rosen (1955), placement tests must be extremely valid to
be useful when the behavior to be predicted is already reasonably well
predicted without the test. In this study, correct placement without
the test is approximately 75%.
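
The base-rate argument can be made concrete with a short sketch. It compares
the proportion of correct decisions obtained with a test against the
proportion obtained by always predicting the more common outcome. The
function names and the two validity figures (hit rate and correct rejection
rate) are hypothetical values chosen for illustration; only the 75% base
rate comes from this study.

```python
def accuracy_with_test(base_rate, hit_rate, correct_rejection_rate):
    """Proportion of correct decisions when the test is used.

    base_rate: proportion of students already correctly placed.
    hit_rate: P(test says "correctly placed" | student is correctly placed).
    correct_rejection_rate: P(test says "misplaced" | student is misplaced).
    """
    return base_rate * hit_rate + (1.0 - base_rate) * correct_rejection_rate


def beats_base_rate(base_rate, hit_rate, correct_rejection_rate):
    """True if the test gives more correct decisions than always
    predicting the more common outcome."""
    without_test = max(base_rate, 1.0 - base_rate)
    with_test = accuracy_with_test(base_rate, hit_rate, correct_rejection_rate)
    return with_test > without_test


BASE_RATE = 0.75  # approximate base rate reported in this study

# Hypothetical validity figures, for illustration only.
print(beats_base_rate(BASE_RATE, 0.70, 0.80))
print(beats_base_rate(BASE_RATE, 0.95, 0.70))
```

Under these assumed figures, a test that is right 70 to 80 percent of the
time within each group still does worse overall than ignoring the test,
which is the point made by Meehl and Rosen.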

Methodology 

To be successful, a placement test must provide a scoring system
that will minimize the number of students requiring retesting at a higher
or lower level, but must be brief and easily administered. The first
step was to construct a short test representing diverse elements of
the 40 battery tests of the WDRSD Word Attack area. As the Word Attack
battery ranges in difficulty from Level A (prereading level) to Level D
(completion of word attack skills), no student takes the full battery
of 40 tests in any single year. Therefore, it was necessary to evaluate
tests at a given level and combine by extrapolation to a scaling model.
Tests at each level were examined to see which were the most successful
in predicting the decision of correct or incorrect placement. Then
items within each scale were evaluated in regard to the same decision.
In addition, scales were included that reflect a representative sample
of phonic and structural strands within Word Attack. The final placement
test consisted of 30 items: 5 items from a Level A test, 10 from two
Level B tests, 10 from two Level C tests, and 5 from a Level D test.
The means of these tests formed a scaling pattern, and items were
chosen to retain this scaling pattern with the shorter subtests.

Data Source 

In the Spring of 1973 the Word Attack Placement Test was administered
to all students in two elementary schools (a suburban school in Wisconsin
and an urban school in California), prior to break-in testing for the
Wisconsin Design. These schools were not allowed to use the placement
test information for initial leveling. Students were scored for
mastery (80%) as well as total score on each subscale.
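
The scoring described above can be sketched as follows. The 80% mastery
criterion and the idea of a total score per subscale come from the paper;
the function names, the 0/1 item coding, and the sample responses are
assumptions made for illustration.

```python
MASTERY_PERCENT = 80  # mastery criterion stated in the paper


def score_subscale(item_responses):
    """Return (total_score, mastered) for one subscale.

    item_responses: list of 0/1 values, 1 meaning a correct item.
    Integer arithmetic avoids floating-point edge cases at the cutoff.
    """
    total = sum(item_responses)
    mastered = total * 100 >= MASTERY_PERCENT * len(item_responses)
    return total, mastered


def mastery_pattern(subscales):
    """Build a pattern string such as "110" (1 = mastery, 0 = nonmastery)
    from a list of subscales, each a list of 0/1 item responses."""
    return "".join("1" if score_subscale(items)[1] else "0" for items in subscales)


# Hypothetical responses for three five-item subscales.
responses = [
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0],
    [1, 0, 0, 0, 0],
]
print([score_subscale(items) for items in responses])
print(mastery_pattern(responses))
```

With five items per subscale, the criterion amounts to at least four items
correct.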

Results 

Placement Test 

Means, standard deviations and reliabilities for the six subscales
are given in Table 1. Separate results are listed for grades
K-4 and grades 1-4, as Kindergarteners were not required to take the
final three subscales unless they could read. The Kindergarteners' scores
were retained for pattern analysis, however, thus inflating reliability
estimates.

Pattern analysis according to mastery scores of the six subscales
is given in Tables 2 and 3. For the total sample, 639 of 776 or 82.34%
of the students conformed to the scale pattern expected. Of the non-scale
patterns found (see Table 3), 10 of the Kindergarteners failed to master
Subscale 1 but did master Subscale 2 (pattern #1), and 33 of the students
mastered all subscales but Subscale 5 (pattern #20). Based on this
information and the fact that neither Subscale 1 nor Subscale 6 discriminated
well in the range of students available for this study, both subscales
were dropped from further analysis. This eliminated non-scale patterns
1, 3, 18 and 20, and left 716 of 776 or 92.26% of the students conforming
to the scale patterns.
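
The scale-pattern check can be sketched as below. A pattern is treated as
conforming when no mastery (1) follows a nonmastery (0), which is how the
scale patterns in Table 2 read; this is often called a Guttman-type pattern,
a label not used in the paper. The function names are hypothetical.

```python
def is_scale_pattern(pattern):
    """True if no mastery ("1") appears after a nonmastery ("0"),
    e.g. "111000" conforms and "111101" does not."""
    return "01" not in pattern


def drop_subscales(pattern, positions):
    """Return the pattern with the subscales at the given 1-based
    positions removed."""
    return "".join(
        mark for index, mark in enumerate(pattern, start=1)
        if index not in positions
    )


# Pattern #20 (all subscales mastered except Subscale 5) is a nonscale
# pattern on six subscales, but conforms once Subscales 1 and 6 are dropped.
pattern_20 = "111101"
print(is_scale_pattern(pattern_20))
reduced = drop_subscales(pattern_20, {1, 6})
print(reduced, is_scale_pattern(reduced))

# Proportion conforming on all six subscales, from the counts in the text.
print(100 * 639 / 776)
```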
Relationship to Break-in Testing

Results of the break-in tests for the full battery, using standard,
non-placement test guidelines, are given in Table 4. A student is considered
inappropriately leveled if he masters 0 or only 1 scale at a level (test down)
or masters all or all but 1 test at a level (test up). The overall error
for initial placement of 26.6% was very close to the expected 25%, with
somewhat larger error rates at the lower battery levels. One difficulty
that surfaced at this time in regard to this sample was that almost all
inappropriate placements were "test ups." This was undoubtedly due to the
fact that the break-in testing occurred late in the school year; a different
pattern of errors would be expected if break-in had occurred in the fall.
Therefore, the results obtained in this study apply only to Spring implementation
of Word Attack, and the study will need to be repeated next fall.
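
The leveling rule stated above can be written as a small function. This is
a sketch of the rule as worded in the text, not the project's scoring
procedure; the function name is hypothetical, and the requirement of at
least three tests per level is an assumption added so that the "test down"
and "test up" conditions cannot overlap.

```python
def leveling_decision(n_mastered, n_tests):
    """Classify a student's break-in result at one level.

    n_mastered: number of tests mastered at the level.
    n_tests: number of tests given at the level (assumed to be >= 3).
    """
    if n_tests < 3:
        raise ValueError("rule is ambiguous with fewer than 3 tests")
    if not 0 <= n_mastered <= n_tests:
        raise ValueError("n_mastered must lie between 0 and n_tests")
    if n_mastered <= 1:
        return "test down"      # masters 0 or only 1
    if n_mastered >= n_tests - 1:
        return "test up"        # masters all or all but 1
    return "appropriate"


# Hypothetical level with eight tests.
for mastered in (0, 1, 4, 7, 8):
    print(mastered, leveling_decision(mastered, 8))
```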

Comparisons of the placement test results with the full-battery
results are given in Table 5. Conditional probabilities were computed
separately for the appropriately and inappropriately placed students
at each test level. Predictions from the Placement Test were run for
each subscale, combinations of subscales, and for test score totals in
addition to combinations. In order to simplify the table, only subscales
and subscale combinations are listed.
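
One way to compute such conditional probabilities is sketched below: the
proportion of students mastering a given subscale, computed separately for
appropriately and inappropriately placed students. The record layout, the
function name, and the sample data are hypothetical; the paper does not
describe its computation in this detail.

```python
from collections import Counter


def conditional_mastery_rates(records):
    """records: iterable of (appropriately_placed, mastered) boolean pairs
    for one level and one subscale.

    Returns {True: P(mastered | appropriately placed),
             False: P(mastered | inappropriately placed)},
    with None for a group that has no students.
    """
    group_sizes = Counter()
    group_mastered = Counter()
    for appropriately_placed, mastered in records:
        group_sizes[appropriately_placed] += 1
        if mastered:
            group_mastered[appropriately_placed] += 1
    return {
        group: (group_mastered[group] / group_sizes[group]
                if group_sizes[group] else None)
        for group in (True, False)
    }


# Hypothetical records: four appropriately placed students, two not.
sample = [
    (True, True), (True, False), (True, False), (True, False),
    (False, True), (False, True),
]
print(conditional_mastery_rates(sample))
```

A subscale is informative at a level when the two proportions differ
markedly; when they are close, mastery of that subscale says little about
placement.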

Results indicate that the Placement Test could markedly improve
placement at Level A, using mastery of Subscale 2 only. Level A-B
was totally unpredictable; the Placement Test Subscale 3 had negative
discrimination at this level. Detailed analysis of the sample suggested
that the problem arose in two classes of Kindergarteners where a number
of students could master enough tests at A-B to be judged test-ups,
but had not had the material in Subscale 3 and could not master it.
Further work will need to be done at A-B in order to make the Placement
Test useful.

Predictions at Levels B and C were somewhat above the base rates,
when total test scores were included with subscale mastery scores. The
results were not particularly striking, however, and it seems likely that
at these levels predictions will have to be made in one direction only.
The Placement Test will probably be recommended as a threshold variable;
if a student does not achieve mastery of a number of subscales and/or a
total score of a certain level, a prediction can be made that the
student should not be tested up. However, a score at or above the
levels set will have to be interpreted as a sign to consider further
information before testing up.
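
The one-directional threshold rule can be sketched as follows. The paper
does not give threshold values, and its "and/or" wording leaves the exact
combination open; this sketch assumes that failing either criterion is
enough to predict "do not test up". The function name and the thresholds
used in the example are hypothetical.

```python
def threshold_recommendation(n_subscales_mastered, total_score,
                             min_subscales, min_total):
    """One-directional use of the Placement Test.

    Below either threshold (an assumed reading of "and/or"): predict
    that the student should not be tested up. At or above both: make
    no prediction, only a prompt to look at further information.
    """
    if n_subscales_mastered < min_subscales or total_score < min_total:
        return "do not test up"
    return "consider further information"


# Hypothetical thresholds: 3 subscales mastered and a total score of 15.
print(threshold_recommendation(2, 20, min_subscales=3, min_total=15))
print(threshold_recommendation(4, 10, min_subscales=3, min_total=15))
print(threshold_recommendation(4, 18, min_subscales=3, min_total=15))
```

The asymmetry is deliberate: a low score yields a decision, while a high
score only triggers a closer look.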

Due to the high base rate of appropriate placement (95.2% in Table 4),
Level D was not predictable from the Placement Test. This is not a serious
problem, as the decision to test up from Level D means that the child has
completed all Word Attack skills, and thus is the type of decision
which should be made from the total battery, not from a Placement Test.

Implications 

Based on the data collected, the model for developing placement
tests for objectives-based curriculum management systems presented here
has been reasonably successful. Predictions for three of the five curriculum
levels were better than the base rate, although the results obtained in
this study need cross-validation and further investigation at another
time in the school year. Further, the Placement Test was developed with
no additional test construction and proved to be a good instrument, both
in terms of internal characteristics and in scale patterns. It may well
be that the scale distances of the subscales need to be adjusted, as
there was more than the expected mastery overlap for the two most difficult
scales, yet the Placement Test may well need somewhat greater "top" than
was available with only four subscales.

Possibly the most important finding in this study was that the Placement
Test could provide highly accurate information in only one direction. Given
the base rate of approximately 75%, the Placement Test was quite effective



Table 2

Number of Students Conforming to Scale Patterns
(0 = nonmastery, 1 = mastery)

Scale Patterns    Number of Students    Percent of Students

    000000               22                    2.84
    100000               59                    7.61
    110000               93                   11.98
    111000              106                   13.65
    111100              145                   18.68
    111110               84                   10.83
    111111              130                   16.75

    Total               639                   82.34



Table 1

Means, Standard Deviations and Reliabilities of the Subscales of
the Word Attack Placement Test

                 Mean          Standard Deviation      Reliability
Subscale      K-4     1-4         K-4     1-4          K-4     1-4

   A         4.82    4.94         .60     .32          .64     .55
   B         4.45    4.72        1.15     .72          .79     .63
   C         3.82    4.35        1.69    1.16          .86     .73
   D         3.20    3.79        2.04    1.63          .91     .83
   E         2.27    2.69        1.82    1.67          .79     .71
   F         2.06    2.44        1.79    1.69          .78     .71

   Total    20.61   22.95        7.40    5.36          .94     .88






Table 3

Number of Students by Grade Having Nonscale Patterns for the
Word Attack Placement Test

Nonscale patterns (0 = nonmastery, 1 = mastery); counts by grade for the
individual patterns are not legible in this copy:

     1. 010000        11. 101101
     2. 010100        12. 101111
     3. 011100        13. 110010
     4. [illegible]   14. 110100
     5. 100100        15. 110110
     6. 101000        16. 110101
     7. 100111        17. 110111
     8. 101010        18. 111001
     9. 101100        19. 111010
    10. 101110        20. 111101

Grade                           K      1      2      3      4    Total

Total Nonscale Patterns        10     21     41     52     13     137
Total # Taking Test           122    221    186    198     49     776
% of Nonscale Patterns        8.2    9.5   22.0   26.3   26.5    17.6



Table 4

Results of Full Battery Break-in Testing

Level     Appropriately    Inappropriately         %
             Placed            Placed          Appropriate

  A            58                41               58.6
  A-B          12                16               42.9
  B           107                38               73.8
  C           154                56               73.3
  D            99                 5               95.2

  Total       430               156               73.4






Table 5

Comparison of Placement Test results with full-battery break-in results:
conditional probabilities for appropriately and inappropriately placed
students at each level, by subscale and subscale combination.
[Table contents are not legible in this copy.]


in providing a threshold level below which it could be said with considerable
certainty that the student was properly leveled, but the decision to "test
up" when the threshold score was exceeded could not be made with similar
accuracy. It remains to be seen whether this notion of the placement
test as a threshold measure will be supported in a new sample that contains a
larger portion of "test downs".






References

Bloom, B., Hastings, T., and Madaus, G. Handbook on Formative and Summative
Evaluation of Student Learning. Chicago: McGraw-Hill, 1971.

Cronbach, L. and Gleser, G. Psychological Tests and Personnel Decisions.
Urbana: University of Illinois Press, 1965.

Hillson, M. and Bongo, J. Continuous-Progress Education. Palo Alto:
Science Research Associates, 1971.

Meehl, P. E. and Rosen, A. Antecedent probability and the efficiency of
psychometric signs, patterns or cutting scores. Psychological Bulletin,
1955, 52, 194-216.

Otto, W. and Askov, E. The Wisconsin Design for Reading Skill Development:
Rationale and Guidelines. Minneapolis: National Computer Systems, 1972.



